-
Notifications
You must be signed in to change notification settings - Fork 142
feat(key on repo url): support git hosts other than GitHub + multiple forks #1043
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for endearing-brigadeiros-63f9d0 canceled.
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1043 +/- ##
==========================================
- Coverage 83.25% 82.77% -0.49%
==========================================
Files 59 66 +7
Lines 2449 2787 +338
Branches 280 335 +55
==========================================
+ Hits 2039 2307 +268
- Misses 366 433 +67
- Partials 44 47 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after an initial scan through :) thanks for your contribution!
Picked up a couple of test failures after merging main - will resolve (and start working on the additional tests needed). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through the approval/rejection flows with a pre-existing repo, and things work well!
There is an issue with backwards compatibility with older, invalid databases from previous versions of GitProxy (unique URL enforcement with repos). This may also cause issues with the other files (pushes, users).
I also tested the Add Repo flow which caused my server to crash, maybe because of something wrong on my end (invalid input maybe?).
This comment was marked as resolved.
This comment was marked as resolved.
I think catching and displaying a simple error message with the invalid entry/entries could be enough - so that the GitProxy administrator can quickly identify the issue and fix it manually. Thankfully, the error seems to occur on backend (db) startup, so end users wouldn't really have the app suddenly blowing up. |
Signed-off-by: Raimund Hook <[email protected]>
…que as its our new primary key
Signed-off-by: Raimund Hook <[email protected]>
…te non-specific db fns
Signed-off-by: Raimund Hook <[email protected]>
Typescript wasn't working on the DB classes due to their dependency imports with require.
…making github fns optional
…file-based DB implementation
…file-based DB implementation
@andypols in the earlier iterations of this PR I was using the proxy middlewares directly in the Router and the only good way to add/remove them again was to restart. I ended up changing that to wrap them in a custom middleware which uses the proxy middlewares in the I'm loathe to take that on before we merge the current state, as I think restarting for a new origin is a low (but not no!) impact flaw and we could deal with it in a subsequent issue/PR. |
Fair enough — I can see that. It was just a question, not a criticism. From my (possibly overly simplistic) perspective, the proxy shouldn't care what the Git domain is. Happy to see it handled in a subsequent PR! |
No problem! You've just got me thinking (out loud).
One thing that lead to it is the express-http-proxy middleware taking the proxied domain as an argument - it will accept a function, but those can only rotate which host requests are proxied to as the function doesn't have the request itself in scope). I guess another approach would be to create the proxy middleware in response to each request; I didn't do that on the (untested!) assumption that it would could cause a performance bottleneck (e.g. by checking the proxied hosts in the DB on each request). However, that's likely a small penalty... |
…t-proxy into 950-key-on-repo-urls
…loyment-candidate-5
On failure to retrieve repository metadata as it will occur for all users on private projects.
@kriswest Thank you very much for your help. Although the issue hasn't been resolved yet, I've taken a look at some of the suggestions you provided.
I added a diff --git a/src/proxy/chain.ts b/src/proxy/chain.ts
index 2130519c..94446568 100644
--- a/src/proxy/chain.ts
+++ b/src/proxy/chain.ts
@@ -32,6 +32,9 @@ let pluginsInserted = false;
export const executeChain = async (req: any, res: any): Promise<Action> => {
let action: Action = {} as Action;
try {
+ console.log(req);
+ if (!req.body)
+ console.log("no req.body !!");
action = await proc.pre.parseAction(req);
const actionFns = await getChain(action);
@@ -54,6 +57,7 @@ export const executeChain = async (req: any, res: any): Promise<Action> => {
}
}
+ console.log(action);
return action;
};
Unfortunately, our company's firewall policy blocks push to github.com. I'll try testing from home later.
This issue also occurs in other projects and repositories.
I tried to merge PR 973 on this PR, but a conflict occurred. I'll likely need to try again after this PR and 973 are merged. |
Hi @kyet, hopefully 973 will be through review and merged shortly. I'll update this shortly after. However, I can't think of anything (in the PR) that would result in in Happy to look at the request and headers in case it tells us something. |
Conflicts resolved and ready for a another look. I haven't had a chance to test it yet, but the tests are all passing. |
I'm aware I haven't done anything in the documentation regarding this PR. That should probably be reviewed and work to add to the docs undertaken under a new issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! I have a few comments - hope we can tackle these and get this PR ready to merge soon. 🚀
A few more things I was wondering:
What exactly are the "breaking" changes, and what are the steps an organization must follow to upgrade GitProxy to v2? I have a feeling that some of the issues I encountered might have been due to "bad data" - something that could be updated with a script to avoid errors in v2.
So two important action points for the v2 release:
- Documenting the breaking changes for both #973 and this PR
- Ideally automating the migration process for v1 -> v2 databases so GitProxy administrators don't need to do anything (and potentially mess up the upgrade process)
- If automating is not plausible, we should document likely problems and their solutions (for example: normalizing .git ending on repo URLs to prevent frontend display bugs)
.set('accept', 'application/x-git-upload-pack-request') | ||
.buffer(); | ||
|
||
res2.should.have.status(404); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is failing on my local setup:
1) proxy express application
should be restarted by the api and stop proxying requests for a host (e.g. gitlab.com) when the last project at that host is DELETED via the API:
AssertionError: expected Response{ domain: null, …(34), …(2) } to have status code 404 but got 200
+ expected - actual
-200
+404
at Context.<anonymous> (test/testProxyRoute.test.js:452:22)
at Generator.next (<anonymous>)
at fulfilled (test/testProxyRoute.test.js:5:58)
at processTicksAndRejections (node:internal/process/task_queues:105:5)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a gitlab repo in your data already? The test is assuming you don't. Does the failure persist if you delete the .data
dir for the fileDB and re-run (and does it come back if you re-run)?
I'm able to run multiple times with the same data dir and do not get that failure.
... however, I note some oddities to the logging:
Restarting the proxy to remove a host
No plugins configured
Found 0 plugin modules
HTTP server closed
Initializing proxy router for origins: '["github.com"]'
setting up origin: 'github.com'
setting up catch-all route (github.com) for backwards compatibility
proxy keys registered: ["/github.com/"]
HTTP Proxy Listening on 8000
processing request URL: '/gitlab.com/gitlab-community/meta.git/info/refs?service=git-upload-pack'
proxy keys registered: ["/github.com/"]
using fallback
Action processed: Allowed
Request URL: /gitlab.com/gitlab-community/meta.git/info/refs?service=git-upload-pack
Host: 127.0.0.1:58153
User-Agent: git/2.42.0
Request resolved to https://github.com/gitlab.com/gitlab-community/meta.git/info/refs?service=git-upload-pack
✔ should be restarted by the api and stop proxying requests for a host (e.g. gitlab.com) when the last project at that host is DELETED via the API (694ms)
Looks like the request was allowed by the fallback. It should have been blocked. I think the 404 must be coming from GitHub and making the test pass my end. Not sure why it fails on yours, but clearly it needs a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing the GitLab repo fixed this as well!
Might be nice to have before
and after
functions to prevent these - some developers might assume there's a problem in their code or the tests themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but I'm in two minds as to whether to we should clean up other repos on that domain or just fail out earlier and make it clear what the problem is... Possibly a case that runs first with a clear name like "Check that are no existing Gitlab repos in the database"?. Alternatively just prefix the relevant tests with an expect on a clearly named variable?
I can do either today hopefully, I just have to decide which way to go!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at these in more detail, I feel like the simplest choice is to make the reason for failure clear, since most contributors won't be testing with a pre-loaded database to begin with...
|
||
it('Proxy route helpers should return the proxied origin', async function () { | ||
const origins = await getAllProxiedHosts(); | ||
expect(origins).to.eql([TEST_REPO.host]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is also failing on my local setup - perhaps due to missing test data? Or is it my local data that's causing the problem?
2) add new repo
Proxy route helpers should return the proxied origin:
AssertionError: expected [ 'gitlab.com', 'github.com' ] to deeply equal [ 'github.com' ]
+ expected - actual
[
- "gitlab.com"
"github.com"
]
at Context.<anonymous> (test/testRepoApi.test.js:266:24)
at Generator.next (<anonymous>)
at fulfilled (test/testRepoApi.test.js:5:58)
at processTicksAndRejections (node:internal/process/task_queues:105:5)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes thats likely some existing repo pointing to gitlab.com. Try deleting you .data folder and re-running a few times.
Perhaps I should add an expect at the start of the test for there to be no repos with that origin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the extra hosts and it worked 👍🏼
Perhaps I should add an expect at the start of the test for there to be no repos with that origin?
This would be helpful in case people try to run the tests locally!
Looks good! Just a few comments on the failing tests due to pre-existing GitLab origin (as well as other origins). It'd be fantastic if we could fix those to be agnostic of the data in the database. |
Making the tests agnostic of the data would be good, but I haven't figured out how yet - the requests that go out are affected the response from the other end, so we'd probably need to mock something in the proxy, perhaps the URL it would forward a request on to, and then check it had been called and returned the right value? |
@kriswest I've taken another look at the tests, and since the failing ones are "end to end" if anything, it'd be harder than I thought to mock out the database dependency (and it only really makes sense for unit/function tests). As long as the reason for failure is obvious for contributors, that should be enough for now so we can speed up the release. We can make the tests more robust in another issue (#978 and #1143 are related to this). |
Hi, @kriswest. I found culprit of the I dumped the packets, I found that my git client was actually sending the contents. The pattern matching fails in the code below, and therefore the
const isPackPost = (req: Request) =>
req.method === 'POST' &&
// eslint-disable-next-line no-useless-escape
/^\/[^\/]+\/[^\/]+\.git\/(?:git-upload-pack|git-receive-pack)$/.test(req.url);
const teeAndValidate = async (req: Request, res: Response, next: NextFunction) => {
if (!isPackPost(req)) return next();
..
try {
..
(req as any).body = buf; In my case, Upon reviewing the regex, I noticed that there is a change history in the commit below. You may need to revert or modify the regex changes. I would like to share one more thing related to this issue. As I mentioned earlier, our company has a firewall issue that prevents us from pushing to github.com, so I tested it at home. The test environment is different, but when I set up a github.com proxy like this, [remote "proxy"]
url = http://localhost:8000/github.com/kyet/git-proxy.git
fetch = +refs/heads/*:refs/remotes/proxy/*
[remote "proxy2"]
url = http://localhost:8000/kyet/git-proxy.git
fetch = +refs/heads/*:refs/remotes/proxy2/* If I think about it in relation to the above issue, when falling back to the default proxy, I assume that |
resolves #950
resolves #511
resolves #66
resolves #1107
resolves #1028
Refactor (api, proxy & UI) to remove the assumption of GitHub as the git repository host and the use of the repository
name
field as the id of the repository (as this prevents git-proxy instances from supporting multiple forks of a project or projects from multiple hosts with the same name).This PR:
name
field in the API with the _id field generated by the database adaptors,names
to be repeated (multiple forks or clashing names from different organisations/repository hosts)organisation/repoName.git
in the proxy URLs with the repository urlbecomes
https://myGitProxyInstance.com:8443/github.com/finos/git-proxy.git
To Do:
(contributed as part of a GitLab CoCreate collaboration with help from @StingRayZA)